Probability-generating function

In probability theory, the probability-generating function of a discrete random variable is a power series representation (the generating function) of the probability mass function of the random variable. Probability-generating functions are often employed for their succinct description of the sequence of probabilities Pr(X = i) in the probability mass function for a random variable X, and to make available the well-developed theory of power series with non-negative coefficients.

Definition

Univariate case

If X is a discrete random variable taking values in the non-negative integers {0,1, ...}, then the probability-generating function of X is defined as [1]

G(z) = \operatorname{E} (z^X) = \sum_{x=0}^{\infty}p(x)z^x,

where p is the probability mass function of X. Note that the subscripted notations GX and pX are often used to emphasize that these pertain to a particular random variable X, and to its distribution. The power series converges absolutely at least for all complex numbers z with |z| ≤ 1; in many examples the radius of convergence is larger.
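As a minimal illustrative sketch (Python with sympy; not part of the article, and the fair six-sided die is chosen only for the example), the probability-generating function can be built directly from the definition and checked against the requirement that the probabilities sum to one:

import sympy as sp

z = sp.symbols('z')
pmf = {x: sp.Rational(1, 6) for x in range(1, 7)}  # p(x) = 1/6 for x = 1, ..., 6

# G(z) = sum over x of p(x) * z**x, per the definition above
G = sum(prob * z**x for x, prob in pmf.items())

print(sp.factor(G))  # z*(z**5 + z**4 + z**3 + z**2 + z + 1)/6
print(G.subs(z, 1))  # 1, since the probabilities sum to one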

Multivariate case

If X = (X1,...,Xd ) is a discrete random variable taking values in the d-dimensional non-negative integer lattice {0,1, ...}d, then the probability-generating function of X is defined as

G(z) = G(z_1,\ldots,z_d)=\operatorname{E}\bigl (z_1^{X_1}\cdots z_d^{X_d}\bigr) = \sum_{x_1,\ldots,x_d=0}^{\infty}p(x_1,\ldots,x_d)z_1^{x_1}\cdots z_d^{x_d},

where p is the probability mass function of X. The power series converges absolutely at least for all complex vectors z = (z1,...,zd ) ∈ ℂd with max{|z1|,...,|zd |} ≤ 1.
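As a small sketch of the bivariate case (the choice of two independent Bernoulli(p) components is an assumption for illustration), the joint probability-generating function factors into the product of the marginal ones:

import sympy as sp

z1, z2, p = sp.symbols('z1 z2 p')

# joint pmf of independent components: p(x1, x2) = Pr(X1 = x1) * Pr(X2 = x2)
G = sum((1 - p)**(1 - x1) * p**x1 * (1 - p)**(1 - x2) * p**x2 * z1**x1 * z2**x2
        for x1 in (0, 1) for x2 in (0, 1))

print(sp.factor(G))  # (p*z1 - p + 1)*(p*z2 - p + 1), the product of the marginal PGFs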

Properties

Power series

Probability-generating functions obey all the rules of power series with non-negative coefficients. In particular, G(1^-) = 1, where G(1^-) = lim_{z→1^-} G(z) is the limit as z approaches 1 from below, since the probabilities must sum to one. So the radius of convergence of any probability-generating function must be at least 1, by Abel's theorem for power series with non-negative coefficients.

Probabilities and expectations

The following properties allow the derivation of various basic quantities related to X:

1. The probability mass function of X is recovered by taking derivatives of G

  p(k) = \operatorname{Pr}(X = k) = \frac{G^{(k)}(0)}{k!}.

2. It follows from Property 1 that if random variables X and Y have equal probability-generating functions, GX = GY, then pX = pY. That is, if X and Y have identical probability-generating functions, then they have identical distributions.

3. The normalization of the probability mass function can be expressed in terms of the generating function by

\operatorname{E}(1)=G(1^-)=\sum_{i=0}^\infty p(i)=1.

The expectation of X is given by

 \operatorname{E}\left(X\right) = G'(1^-).

More generally, the kth factorial moment, E(X(X − 1) ⋯ (X − k + 1)), of X is given by

\operatorname{E}\left(\frac{X!}{(X-k)!}\right) = G^{(k)}(1^-), \quad k \geq 0.

So the variance of X is given by

\operatorname{Var}(X)=G''(1^-) + G'(1^-) - \left [G'(1^-)\right ]^2.

4. G_X(e^{t}) = M_X(t), where X is a random variable, G_X is the probability-generating function of X and M_X is its moment-generating function.
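The following sympy sketch illustrates Properties 1, 3 and 4 above (the Poisson and binomial distributions are assumptions chosen for the example): it recovers the Poisson pmf from derivatives of G at 0, obtains the binomial mean and variance from derivatives at 1, and checks that G(e^t) reproduces the moment-generating function.

import sympy as sp

z, t = sp.symbols('z t')
lam, p, n = sp.symbols('lambda p n', positive=True)

# Property 1: Pr(X = k) = G^(k)(0)/k!, here for the Poisson PGF
G_pois = sp.exp(lam * (z - 1))
for k in range(3):
    print(k, sp.simplify(sp.diff(G_pois, z, k).subs(z, 0) / sp.factorial(k)))
    # prints lambda**k * exp(-lambda) / k!, the Poisson pmf

# Property 3: mean and variance from derivatives at 1, here for a binomial PGF
G_bin = ((1 - p) + p * z)**n
mean = sp.diff(G_bin, z).subs(z, 1)
var = sp.diff(G_bin, z, 2).subs(z, 1) + mean - mean**2
print(sp.simplify(mean))  # n*p
print(sp.simplify(var))   # n*p - n*p**2, i.e. n*p*(1 - p)

# Property 4: G_X(e^t) equals the moment-generating function M_X(t)
M_pois = sp.exp(lam * (sp.exp(t) - 1))  # the Poisson MGF
print(sp.simplify(G_pois.subs(z, sp.exp(t)) - M_pois))  # 0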

Functions of independent random variables

Probability-generating functions are particularly useful for dealing with functions of independent random variables. For example, if X1, X2, ..., Xn is a sequence of independent (and not necessarily identically distributed) random variables and

S_n = \sum_{i=1}^n a_i X_i,

where the ai are constants, then the probability-generating function is given by

G_{S_n}(z) = \operatorname{E}(z^{S_n}) = \operatorname{E}\bigl(z^{\sum_{i=1}^n a_i X_i}\bigr) = G_{X_1}(z^{a_1})G_{X_2}(z^{a_2})\cdots G_{X_n}(z^{a_n}).
For example, if
S_n = \sum_{i=1}^n X_i,
then the probability-generating function, GSn(z), is given by
G_{S_n}(z) = G_{X_1}(z)G_{X_2}(z)\cdots G_{X_n}(z).
It also follows that the probability-generating function of the difference of two independent random variables S = X1X2 is
G_S(z) = G_{X_1}(z)G_{X_2}(1/z).
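As an illustrative numerical sketch (the pmfs below are made up for the example), multiplying probability-generating functions corresponds to convolving probability mass functions, so the coefficients of GX1(z)GX2(z) can be computed with a discrete convolution:

import numpy as np

p_X = np.array([0.2, 0.5, 0.3])  # pmf of X1 on {0, 1, 2}
p_Y = np.array([0.6, 0.4])       # pmf of X2 on {0, 1}

# product of PGFs = polynomial multiplication = convolution of coefficient sequences
p_sum = np.convolve(p_X, p_Y)

print(p_sum)        # pmf of X1 + X2 on {0, 1, 2, 3}
print(p_sum.sum())  # 1.0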
Suppose also that N is an independent discrete random variable taking values in the non-negative integers, with probability-generating function GN. If the X1, X2, ..., XN are independent and identically distributed with common probability-generating function GX, then for the random sum SN = X1 + ... + XN,

G_{S_N}(z) = G_N(G_X(z)).
This can be seen, using the law of total expectation, as follows:
 G_{S_N}(z) = \operatorname{E}(z^{S_N}) = \operatorname{E}(z^{\sum_{i=1}^N X_i}) = \operatorname{E}\big(\operatorname{E}(z^{\sum_{i=1}^N X_i}| N) \big) = \operatorname{E}\big( (G_X(z))^N\big) =G_N(G_X(z)).
This last fact is useful in the study of Galton–Watson processes.
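As a sketch of this identity under assumptions chosen for illustration (N Poisson with rate lambda, each Xi Bernoulli(p)), the composition GN(GX(z)) yields the probability-generating function of a Poisson variable with rate lambda*p, the classic thinning result:

import sympy as sp

z, lam, p = sp.symbols('z lambda p', positive=True)

G_N = lambda s: sp.exp(lam * (s - 1))  # Poisson(lambda) PGF
G_X = (1 - p) + p * z                  # Bernoulli(p) PGF

print(sp.simplify(G_N(G_X)))  # exp(lambda*p*(z - 1)), a Poisson(lambda*p) PGF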
If the Xi are independent but not identically distributed (and independent of N), with fi = Pr(N = i), then

G_{S_N}(z) = \sum_{i \ge 1} f_i \prod_{k=1}^i G_{X_k}(z).
For identically distributed Xi this simplifies to the identity stated before. The general case is sometimes useful to obtain a decomposition of SN by means of generating functions.
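As a small numerical sketch of this decomposition (all distributions are made up for the example: N uniform on {1, 2}, X1 Bernoulli(1/2), X2 Bernoulli(1/4)):

import sympy as sp

z = sp.symbols('z')
f = {1: sp.Rational(1, 2), 2: sp.Rational(1, 2)}      # pmf of N
G_X = {1: sp.Rational(1, 2) + sp.Rational(1, 2) * z,  # Bernoulli(1/2) PGF
       2: sp.Rational(3, 4) + sp.Rational(1, 4) * z}  # Bernoulli(1/4) PGF

# G_{S_N}(z) = sum over i of f_i * product over k <= i of G_{X_k}(z)
G_SN = sum(f[i] * sp.prod([G_X[k] for k in range(1, i + 1)]) for i in f)

print(sp.expand(G_SN))
print(G_SN.subs(z, 1))  # 1, as every PGF must satisfy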

Examples

The probability-generating function of an almost surely constant random variable, i.e. one with Pr(X = c) = 1, is

G(z) = z^c.
The probability-generating function of a binomial random variable, the number of successes in n trials with probability p of success in each trial, is

G(z) = \left[(1-p) + pz\right]^n.
Note that this is the n-fold product of the probability-generating function of a Bernoulli random variable with parameter p.
The probability-generating function of a negative binomial random variable on {r, r+1, ...}, the number of trials required to obtain the rth success with probability p of success in each trial, is

G(z) = \left(\frac{pz}{1 - (1-p)z}\right)^r,

which converges for |z| < 1/(1 − p).
Note that this is the r-fold product of the probability-generating function of a geometric random variable on {1, 2, ...} with parameter p.
The probability-generating function of a Poisson random variable with rate parameter λ is

G(z) = \textrm{e}^{\lambda(z - 1)}.
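As an illustrative check of these formulas (the parameter value p = 1/3 is arbitrary), expanding the geometric probability-generating function pz/(1 − (1−p)z) as a power series recovers the geometric pmf Pr(X = k) = p(1−p)^(k−1) on {1, 2, ...}:

import sympy as sp

z = sp.symbols('z')
p = sp.Rational(1, 3)
G = p * z / (1 - (1 - p) * z)

# expand G as a power series and compare coefficients with the geometric pmf
coeffs = sp.series(G, z, 0, 5).removeO()
for k in range(1, 5):
    print(k, coeffs.coeff(z, k), p * (1 - p)**(k - 1))  # the two values agree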

Related concepts

The probability-generating function is an example of a generating function of a sequence: see also formal power series. It is occasionally called the z-transform of the probability mass function.

Other generating functions of random variables include the moment-generating function, the characteristic function and the cumulant generating function.